SAILoR: Structure-Aware Inference of Logic Rules

Boolean networks provide an effective mechanism for describing interactions and dynamics of gene regulatory networks (GRNs). Deriving accurate Boolean descriptions of GRNs is a challenging task. The number of experiments is usually much smaller than the number of genes. In addition, binarization leads to a loss of information and inconsistencies arise in binarized time-series data. The inference of Boolean networks from binarized time-series data alone often leads to complex and overfitted models. To obtain relevant Boolean models of gene regulatory networks, inference methods could incorporate data from multiple sources and prior knowledge in terms of general network structure and/or exact interactions. We propose the Boolean network inference method SAILoR (Structure-Aware Inference of Logic Rules). SAILoR incorporates time-series gene expression data in combination with provided reference networks to infer accurate Boolean models. SAILoR automatically extracts topological properties from reference networks. These can describe a more general structure of the GRN or can be more precise and describe specific interactions. SAILoR infers a Boolean network by learning from both continuous and binarized time-series data. It navigates between two main objectives, topological similarity to reference networks and correspondence with gene expression data. By incorporating the NSGA-II multi-objective genetic algorithm, SAILoR relies on the wisdom of crowds. Our results indicate that SAILoR can infer accurate and biologically relevant Boolean descriptions of GRNs from both a static and a dynamic perspective. We show that SAILoR improves the static accuracy of the inferred network compared to the network inference method dynGENIE3. Furthermore, we compared the performance of SAILoR with other Boolean network inference approaches including Best-Fit, REVEAL, MIBNI, GABNI, ATEN, and LogBTF. 
We have shown that by incorporating prior knowledge about the overall network structure, SAILoR can improve the structural correctness of the inferred Boolean networks while maintaining dynamic accuracy. To demonstrate the applicability of SAILoR, we inferred context-specific Boolean subnetworks of female Drosophila melanogaster before and after mating.
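
As a purely illustrative aside, the sketch below shows what a Boolean network with synchronous updates looks like and one simple way to score its agreement with a binarized time series. The three-gene rules, the gene names, and the helpers `step` and `dynamic_accuracy` are hypothetical. They are not taken from SAILoR, and the scoring is only one possible definition of dynamic agreement.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical three-gene rule set, chosen only for illustration.
RULES: Dict[str, Callable[[State], bool]] = {
    "A": lambda s: not s["C"],        # C represses A
    "B": lambda s: s["A"],            # A activates B
    "C": lambda s: s["A"] and s["B"], # A and B jointly activate C
}


def step(state: State) -> State:
    """Apply all rules at once (synchronous update)."""
    return {gene: rule(state) for gene, rule in RULES.items()}


def dynamic_accuracy(series: List[State]) -> float:
    """Fraction of gene values at time t+1 predicted correctly from time t."""
    hits = 0
    total = 0
    for current, observed_next in zip(series, series[1:]):
        predicted = step(current)
        for gene in RULES:
            hits += predicted[gene] == observed_next[gene]
            total += 1
    return hits / total if total else 0.0


# A toy binarized time series that the rules above reproduce exactly.
series = [
    {"A": True, "B": False, "C": False},
    {"A": True, "B": True, "C": False},
    {"A": True, "B": True, "C": True},
    {"A": False, "B": True, "C": True},
]

print(dynamic_accuracy(series))
```

In this toy case every transition is reproduced. Flipping a single value in the series lowers the score, which is the kind of inconsistency that binarization can introduce in real data.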

Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed.
Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

Response:
We would like to clarify that the funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
As instructed, we included the Role of the Funder statement in the cover letter.

General comments:
The study presents a method to infer gene regulatory systems described by Boolean networks. It is generally well written, but some minor grammatical improvements to the abstract and typographical corrections would help readers appreciate it more.

Response:
Thank you for your positive comments. We have thoroughly revised the manuscript to correct grammatical and typographical errors and to improve the phrasing.

Comment #1:
How did you arrive at the limit of the number of regulators to at most 10? The authors have mentioned that this is justified by the scale-free property of networks, but it is not intuitive. Please include the explanation in the manuscript.

Response:
Thank you for your comment. In our manuscript, we included additional justification, where we refer to recent reviews and analyses of Boolean biological models. These reviews show that a negligible fraction of logic functions have 10 or more regulators. For example, Mitra et al. analyzed three datasets of curated Boolean models in which only 60 out of 8,871 Boolean functions have more than 10 inputs. In addition, we analyzed the in-degree distribution of logic functions inferred with SAILoR (see Fig 11). The Boolean functions of the evaluated networks derived with SAILoR mostly have at most 5 regulators. The in-degree distribution is slightly more skewed to the right for logic functions from networks with 64 nodes; however, only 0.4% (25 of 6,400) of such Boolean functions have seven or more distinct inputs.
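
The proportions quoted above can be checked with a few lines of code. The sketch below is illustrative only: the helper names and the toy regulator lists are assumptions and are not part of SAILoR, and only the counts (60 of 8,871 and 25 of 6,400) come from the response.

```python
from collections import Counter
from typing import Dict, List


def in_degree_distribution(regulators: Dict[str, List[str]]) -> Counter:
    """Count how many Boolean functions have each number of distinct inputs."""
    return Counter(len(set(inputs)) for inputs in regulators.values())


def fraction_at_least(dist: Counter, k: int) -> float:
    """Share of functions with k or more distinct inputs."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return sum(n for degree, n in dist.items() if degree >= k) / total


# Hypothetical toy network: target gene -> list of its regulators.
toy = {"A": ["C"], "B": ["A"], "C": ["A", "B", "B"]}
dist = in_degree_distribution(toy)
print(dict(dist), fraction_at_least(dist, 2))

# Counts reported in the response above.
share_curated = 60 / 8871  # functions with more than 10 inputs (Mitra et al.)
share_sailor = 25 / 6400   # functions with seven or more inputs (64-node networks)
print(f"{share_curated:.2%}")  # 0.68%
print(f"{share_sailor:.2%}")   # 0.39%
```

Both shares are well below 1%, which is consistent with the statement that high in-degree functions are rare.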

Comment #2:
Would the results be different (ignoring the computational overhead) if the threshold number of regulators were different from 10?

Response:
Our findings indicate that constraining the number of regulators to 10 has no impact on the dynamic accuracy of networks inferred with SAILoR. However, this value could be revisited when inferring larger Boolean models, as we explained in the conclusion.

Comment #3:
Have you checked various sources regarding the linked role of clk and cyc in the downstream regulation of other clock genes? Note that the reliability of regulation must also be checked.

Response:
Thank you for your comment. Based on your observations, we extended Section 4.4 in our manuscript.
"Both networks differ in many interactions, signifying the importance of a proper context when inferring Boolean networks. For example, consider the positive interaction of Clk and cry from network (V) (see Fig 12). Protein CRY has been linked to circadian rhythmicity as a blue light photoreceptor dedicated to mediating TIM degradation [66] and to resetting of circadian rhythms [67]. However, in network (M) this interaction is reversed (see Fig 13). This further indicates that mating disrupts the circadian rhythm of Drosophila. While the known negative regulation of tim by cry is missing from both networks, SAILoR still identified indirect positive regulation of Clk through double repression, since the TIM-PER heterodimer inhibits CLK-CYC activity. We must also note that SAILoR failed to identify the linked role of Clk and cyc in the downstream regulation of other clock genes [68]. For example, CLK and CYC directly activate transcription of per and tim [66]. Nonetheless, SAILoR still identified the combined role of the CLK-CYC dimer through joint regulation of retn in network (V) and through the indirect regulation of Gadd45 in network (M). The Dead ringer protein (RETN) is implicated as a major repressor of male courtship behavior [69]. In network (M), retn is regulated by per. Additionally, RACK1, an essential receptor at multiple steps of Drosophila development, particularly in oogenesis [70], is heavily regulated in network (M). However, Rack1 was reduced to a constant 1 in network (V). While the relationship between the molecular clock genes and the regulation of female receptivity and egg-laying behavior has not yet been completely explained, it has been shown that altered circadian expression impacts metabolic and neuronal features [39]."

Reviewer #2: General comments:
The proposed method, named SAILoR, uses time-series gene expression data and incorporates prior knowledge of network structure to infer Boolean representations of gene regulatory networks (GRNs). However, SAILoR's current implementation is impractical for larger problems due to space and time complexity. It is best suited for small to medium-sized networks but could be extended with the aid of other methods for larger networks. A limitation is that reference networks should be of similar size to the network being inferred. The paper is well-organized, starting with an introduction of the key challenges, then providing necessary background, and describing the approach and experiments in detail. The writing clearly explains the techniques and results. Relevant work is cited when introducing existing methods and concepts, positioning their contributions in the context of the state-of-the-art. Despite this, I believe that there are several aspects that must be reviewed before the work is acceptable for publication.

Response:
Thank you for your review and for your encouraging comments, which we have thoroughly attended to.

Comment #1:
The use of Boolean networks is a key aspect of this work. However, the state of the art on this topic is not analyzed in depth in the introduction.

Response:
Thank you for your comment. We extended the introduction and included additional references, namely

Comment #2:
Moreover, since there are various models for GRN inference, I suggest that the authors review some recent methods like: